Unsupervised 3D Pose Estimation with Geometric Self-Supervision
We present an unsupervised learning approach to recover 3D human pose from 2D
skeletal joints extracted from a single image. Our method requires no
multi-view image data, 3D skeletons, or 2D-3D point correspondences, and uses
no previously learned 3D priors during training. A lifting network accepts 2D
landmarks as inputs and generates a corresponding 3D skeleton estimate. During
training, the recovered 3D skeleton is reprojected on random camera viewpoints
to generate new "synthetic" 2D poses. By lifting the synthetic 2D poses back to
3D and re-projecting them in the original camera view, we can define a
self-consistency loss in both 3D and 2D. Training can thus be
self-supervised by exploiting the geometric self-consistency of the
lift-reproject-lift process. We show that self-consistency alone is not
sufficient to generate realistic skeletons; however, adding a 2D pose
discriminator enables the lifter to output valid 3D poses. Additionally, to
learn from 2D poses "in the wild", we train an unsupervised 2D domain adapter
network to allow for an expansion of 2D data. This improves results and
demonstrates the usefulness of 2D pose data for unsupervised 3D lifting.
Results on the Human3.6M dataset for 3D human pose estimation demonstrate that
our approach improves upon previous unsupervised methods by 30% and outperforms
many weakly supervised approaches that explicitly use 3D data.
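The lift-reproject-lift consistency check described in this abstract can be sketched numerically. The sketch below is illustrative only: it assumes an orthographic camera (projection drops depth), random viewpoints as rotations about the vertical axis, and a placeholder lifter that pads zero depth; the paper's lifter is a learned network and its losses are minimized jointly with the 2D-pose discriminator.

```python
import numpy as np

def project(pose_3d):
    """Orthographic projection: keep x, y and drop depth."""
    return pose_3d[:, :2]

def rotation_y(theta):
    """Rotation about the vertical (y) axis, simulating a new camera azimuth."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def self_consistency_losses(pose_2d, lift, R):
    """Lift -> rotate -> project -> lift -> rotate back; compare in 2D and 3D."""
    pose_3d = lift(pose_2d)             # first lift from the input view
    synth_2d = project(pose_3d @ R.T)   # "synthetic" 2D pose in a new view
    synth_3d = lift(synth_2d)           # lift the synthetic pose back to 3D
    back_3d = synth_3d @ R              # rotate back into the original view
    back_2d = project(back_3d)          # reproject into the original camera
    loss_3d = np.mean((back_3d - pose_3d) ** 2)
    loss_2d = np.mean((back_2d - pose_2d) ** 2)
    return loss_2d, loss_3d

# Placeholder lifter (assumption, not the paper's network): pad zero depth.
lift0 = lambda p: np.concatenate([p, np.zeros((len(p), 1))], axis=1)
```

Under the identity rotation both losses vanish for any idempotent lifter; a non-trivial rotation exposes the lifter's depth errors, which is what makes the loss a useful training signal.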
Steepest Descent For Efficient Covariance Tracking
Recent research has advocated the use of a covariance matrix of image features for tracking objects instead of the conventional histogram object representation models used in popular algorithms. In this paper we extend the covariance tracker and propose efficient algorithms with an emphasis on both improving the tracking accuracy and reducing the execution time. The algorithms are compared to a baseline covariance tracker and the popular histogram-based mean shift tracker. Quantitative evaluations on a publicly available dataset demonstrate the efficacy of the presented methods. Our algorithms obtain speedup factors of up to 330 while reducing the tracking errors by 86-90% relative to the baseline approach.
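The covariance representation this abstract builds on can be sketched briefly. A common choice (an assumption here, the paper's exact feature set is not given) is the covariance of per-pixel features [x, y, intensity, |Ix|, |Iy|], compared with the log-generalized-eigenvalue metric often used with covariance trackers:

```python
import numpy as np

def covariance_descriptor(patch):
    """5x5 covariance of per-pixel features [x, y, intensity, |Ix|, |Iy|]."""
    patch = patch.astype(float)
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    gy, gx = np.gradient(patch)  # image derivatives
    feats = np.stack([xs, ys, patch, np.abs(gx), np.abs(gy)], axis=-1)
    return np.cov(feats.reshape(-1, 5), rowvar=False)

def covariance_distance(c1, c2):
    """Dissimilarity via generalized eigenvalues: sqrt(sum of log^2 lambda_i)."""
    lam = np.linalg.eigvals(np.linalg.solve(c2, c1)).real
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))
```

A tracker then searches candidate windows for the one whose descriptor is nearest the model covariance; the distance is zero for identical matrices and invariant to affine rescaling of the features.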
Extraction of person silhouettes from surveillance imagery using MRFs
We present a method for the simultaneous detection and segmentation of objects from static images. We employ low-level contour features that enable us to learn the coarse object shape using a simple training phase requiring no manual segmentation. Based on the observation that most interesting objects (e.g., people) have regular and closed boundaries, we exploit relations between these features to extract mid-level cues, such as continuity and closure. For segmentation, we employ a Markov Random Field that combines these cues with information learned from training. The algorithm is evaluated for extracting person silhouettes from surveillance images, and quantitative results are presented.
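The MRF segmentation step can be illustrated with a minimal stand-in: binary labels, per-pixel unary costs in place of the paper's learned shape and contour cues, a Potts pairwise term as a crude surrogate for the continuity/closure cues, and iterated conditional modes (ICM) as a simple inference scheme (the paper does not specify its optimizer; all of these choices are assumptions).

```python
import numpy as np

def icm_segment(cost_fg, cost_bg, beta=1.0, n_iters=5):
    """Binary MRF labeling by iterated conditional modes (ICM).

    cost_fg / cost_bg: per-pixel unary costs for foreground / background.
    beta: weight of the Potts pairwise term encouraging smooth, closed regions.
    """
    labels = (cost_fg < cost_bg).astype(int)  # initialize from unaries alone
    h, w = labels.shape
    for _ in range(n_iters):
        for y in range(h):
            for x in range(w):
                nbrs = [labels[yy, xx]
                        for yy, xx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1))
                        if 0 <= yy < h and 0 <= xx < w]
                # local energy of each label: unary cost + disagreement penalty
                e_fg = cost_fg[y, x] + beta * sum(n != 1 for n in nbrs)
                e_bg = cost_bg[y, x] + beta * sum(n != 0 for n in nbrs)
                labels[y, x] = 1 if e_fg < e_bg else 0
    return labels
```

The pairwise term is what lets the field overrule a weak, noisy unary inside an otherwise coherent silhouette, which is the role the continuity and closure cues play in the paper.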